Machine Translation for Human Translators

Author

  • Michael Denkowski
Abstract

While machine translation is sometimes sufficient for conveying information across language barriers, many scenarios still require precise human-quality translation that MT is currently unable to deliver. Governments and international organizations such as the United Nations require accurate translations of content dealing with complex geopolitical issues. Community-driven projects such as Wikipedia rely on volunteer translators to bring accurate information to diverse language communities. As the amount of data requiring translation has continued to increase, the idea of using machine translation to improve the speed of human translation has gained interest. In the frequently employed practice of post-editing, a machine translation system outputs an initial translation and a human translator edits it for correctness, ideally saving time over translating from scratch. While general improvements in MT quality have led to productivity gains with this technique, there has been little work on designing translation systems specifically for post-editing. In this work, we propose improvements to key components of statistical machine translation systems aimed at directly reducing the amount of work required from human translators. We propose casting MT for post-editing as an online learning task where new training instances are created as humans edit system output, introducing an online translation model that immediately learns from post-editor feedback. We propose an extended translation feature set that allows this model to learn from multiple translation contexts over time as data sources become more reliable. We propose an automatic evaluation metric that scores hypothesis-reference pairs according to several statistics that are directly interpretable as measures of post-editing effort.
Our metric can be used to optimize translation systems in scenarios where standard metrics break down, select optimal system configurations for post-editing, and provide insight into the properties of translation quality that are most important for minimizing editing effort. Our online translation models and evaluation metrics are compatible with standard decoders and optimization algorithms. To evaluate the impact of our post-editing-targeted translation system, we propose a series of experiments that use a web-based framework to collect several types of highly accurate data from human translators. We discuss MT for post-editing as a distinct task and present the results of initial post-editing experiments. We finally outline an experimental setup for collecting valuable data that will be used to evaluate the impact of our online translation models and optimization metrics on human editing requirements.


Similar resources

Improving and Developing an English-to-Persian Computer-Assisted Translation System

In recent years, significant improvements have been achieved in statistical machine translation (SMT), but even the best machine translation technology is still far from replacing or even competing with human translators. Another way to increase the productivity of the translation process is a computer-assisted translation (CAT) system. In a CAT system, the human translator begins to type the tra...


Composing Human and Machine Translation Services: Language Grid for Improving Localization Processes

With the development of Internet environments, more and more language services have become accessible to ordinary people. However, the gap between human translators and machine translators remains huge, especially in the domain of localization processes, which require high translation quality. Although efforts of combining human and machine translators for supporting multilingual communication hav...


Translation Technology Tools and Professional Translators’ Attitudes toward Them

Today, technology is an integral part of professional translation, and it is generally assumed that translators’ attitudes toward translation technology tools influence their interaction with technology (Bundgaard, 2017). Therefore, the present two-phase study seeks to shed some light on what translation technology tools are and how professional translators feel toward them. The research method ...


Human translation and machine translation

While most recent machine translation work has focused on the gisting application (i.e., translating web pages), another important application is to aid human translators. To build better computer-aided translation tools, we first need to understand how human translators work. We discuss how human translators work and what tools they typically use. We also build a novel tool that offers post-e...


Interactive Assistance to Human Translators using Statistical Machine Translation Methods

We investigate novel types of assistance for human translators, based on statistical machine translation methods. We developed Caitra, a tool that makes suggestions for sentence completion, shows word and phrase translation options, and allows postediting of machine translation output. A user study validates the types of assistance and provides insight into the human translation process.


A tool for rapid manual translation

There have been several attempts to realize the idea of a fully automatic translation system for text translation to replace human translators. By contrast, little work has been put into building tools to aid human translators. This report describes the ideas behind such a tool. The tool is intended to aid human translators in achieving higher productivity and better quality, by presenting term...



Publication date: 2013